(57) Digital sound source data is stored in a sound
source data memory 22. When a first display object (an 
enemy character, a waterfall, or the like) so defined as 
to generate a sound is displayed in a three-dimensional 
manner on a display screen 41 of a television 40, an
audio processing unit 12 reads out the corresponding
sound source data from the sound source data memory 
22, to produce first and second sound source data. The 
first and second sound source data are converted into 
analog audio signals by digital-to-analog converters 16a
and 16b, and are then fed to left and right speakers 42L
and 42R. At this time, the audio processing unit 12 cal- 
culates delay time on the basis of a direction to the first 
display object as viewed from a virtual camera (or a 
hero character), and changes delay time of the second 
sound source data from the first sound source data. 
Further, the audio processing unit 12 individually con- 
trols the sound volume levels of the first and second 
sound source data depending on the distance between 
the first display object and the virtual camera (or the 
hero character). Consequently, sounds having a spatial 
extent corresponding to the change of a three-dimen- 
sional image can be respectively generated from the left 
and right speakers 42L and 42R.

Description

BACKGROUND OF THE INVENTION 

Field of the Invention 

The present invention relates generally to a sound 
generator synchronized with image display, and more 
particularly, to a sound generator for generating sounds 
such as a sound effect or music having a three-dimen- 
sional extent on the basis of the direction, the distance, 
and the like between two objects displayed in an image 
processor such as a personal computer or a video 
game set. 

Background of the Invention 

As a technique for generating stereo sounds, a 
technique disclosed in Japanese Patent Publication No. 
9397/1985 (prior art 1) has been conventionally known. 
Prior art 1 produces stereo sounds by outputting one
audio signal unchanged as an analog signal and also
outputting the same audio signal as a delayed analog signal.

However, the prior art 1 cannot be applied to a dig- 
ital sound source because the analog signal is delayed. 
Moreover, in the prior art 1, the moving state of an 
object or a character displayed on a screen of a CRT 
display or the like is not considered, whereby such 
sounds as to be synchronized with the movement of the 
object cannot be generated. 

As a sound generator for generating two-dimensional
sounds in relation to image display, a sound generator
disclosed in Japanese Patent Laid-Open No. 155879/1987
(prior art 2) has been known. Prior art 2 controls the
sound volumes from left and right speakers in a
two-dimensional manner as an airplane, which is a moving
object serving as a sound source, moves: the sound volume
from the speaker from which the airplane moves farther
apart is gradually decreased, and the sound volume of the
speaker to which the airplane moves closer is gradually
increased.

In prior art 2, however, the sound volume is only
gradually decreased or increased as the moving object
moves. Even if the sounds are heard as stereo sounds,
therefore, three-dimensional sound effects are not
obtained. Further, a sound effect obtained by prior art 2
is not suitable as a sound effect for a three-dimensional
image display. The reason is that, although sound effects
are obtained if sounds are generated in synchronization
with the movement of a moving object which moves in a
two-dimensional manner, for example, an airplane or an
automobile, three-dimensional sound effects cannot be
obtained when a three-dimensional image is displayed,
whereby the image display and the sound effects (or the
presence) do not coincide with each other, producing an
uncomfortable impression on a user. Further, when prior
art 2 is applied to a case where the user uses a
headphone, right and left sounds are simultaneously
transmitted to the right and left ears of the user, and
only the sound volumes of the right and left sounds differ
from each other, whereby three-dimensional sound effects
that rely on a delay between the right and left sounds are
not obtained. In addition, when the difference between the
sound volumes of the right and left sounds is made
extremely large, a burden is imposed on the ears of the
user, whereby the user easily feels fatigue. It has been
experimentally known that the use of the headphone for a
long time causes a headache or the like.

When the sound volumes from the right and left
speakers are controlled using prior art 2, the sounds
generated from the speakers spread in every direction
before being heard by the user, whereby a slight time
delay occurs between the sound generated from one of
the right and left speakers and transmitted to the left or
right ear of the user and the sound generated from the
other speaker and transmitted to the right or left ear of
the user. Sound effects similar to those obtained by
delaying one of the sounds are therefore slightly
produced, but three-dimensional sound effects are still
not obtained.

SUMMARY OF THE INVENTION 

Therefore, an object of the present invention is to
provide a sound generator capable of generating
three-dimensional sounds which are changed as a camera is
moved in a case where a three-dimensional image as
viewed from the camera is displayed.

Another object of the present invention is to provide
a sound generator capable of reducing the fatigue of a
user even when the user hears sounds using a headphone.

In order to attain the above-mentioned objects, the
present invention has the following characteristics.

The present invention is directed to a sound generator
for generating, in an image display device for displaying
a three-dimensional image, sounds having a spatial extent
corresponding to the three-dimensional image, which
comprises a first sound generation section for generating
a first sound, a second sound generation section for
generating a second sound, a sound source data storage
section for digitally storing sound source data, a
temporary storage section for temporarily storing the
sound source data read out from the sound source data
storage section, a delay time calculation section for
calculating, when the image display device displays a
first display object so defined as to generate sounds,
delay time on the basis of a direction to the first
display object as viewed from the position of a
predetermined viewpoint, an audio processing section for
reading out the sound source data corresponding to the
first display object from the sound source data storage
section for each unit time, storing the sound source data
in the temporary storage section and reading out the
sound source data as first sound source data, and reading
out the sound source data stored in the temporary storage
section as second sound source data at timing delayed by
the delay time calculated by the delay time calculation
section from timing at which the first sound source data
is read out, a first digital-to-analog conversion section
for converting the first sound source data read out from
the temporary storage section into an analog audio signal
and feeding the analog audio signal to the first sound
generation section, and a second digital-to-analog
conversion section for converting the second sound source
data read out from the temporary storage section into an
analog audio signal and feeding the analog audio signal
to the second sound generation section.

As described in the foregoing, according to the 
present invention, when the first display object is dis- 
played on the three-dimensional image, the delay time 
between the first sound source data and the second 
sound source data is changed depending on the 
change of the direction to the first display object as 
viewed from the position of the predetermined view- 
point, whereby the sounds having a spatial extent corre- 
sponding to the change of the three-dimensional image 
can be generated from the first and second sound
generation sections. Since image display and sound
effects coincide with each other in the present invention,
a user can be made to feel image and sound effects closer
to reality, whereby the interest of the user can be
further improved. Further, when the
user hears the sounds generated from the sound gener- 
ator according to the present invention using a head- 
phone stereo, the fatigue of the user can be reduced. 

The foregoing and other objects, features, aspects 
and advantages of the present invention will become 
more apparent from the following detailed description of 
the present invention when taken in conjunction with the 
accompanying drawings. 

BRIEF DESCRIPTION OF THE DRAWINGS 

Fig. 1 is a block diagram showing the construction
of a sound generator according to a first embodiment
of the present invention;
Fig. 2 is an illustration for explaining the principle in
a case where the amounts of delay of left and right
audio signals are varied on the basis of the positional
relationship between a sound generating object and a
camera;
Fig. 3 is a characteristic view showing the relationship
between the sound volume and the direction in a case
where the sound volume is controlled under the condition
that there is no amount of delay;
Fig. 4 is a characteristic view showing the relationship
between the direction and the sound volume in a case
where, under the condition that there is an amount of
delay, the amount of delay is variably controlled;
Fig. 5 is a characteristic view showing the relationship
between the distance and the sound volume in a case
where the sound volume is controlled under the condition
that there is an amount of delay;
Fig. 6 is a characteristic view in a case where the
amount of delay is controlled in relation to the
positional relationship between a sound generating
object and a camera (a hero character);
Fig. 7 is a block diagram showing the construction
of a sound generator according to a second embodiment
of the present invention;
Fig. 8 is a diagram illustrating a memory space of a
W-RAM shown in Fig. 7;
Fig. 9 is a diagram illustrating one example of a
memory map of a sound memory area corresponding to a
buffer memory shown in Fig. 1;
Fig. 10 is a diagram illustrating another example of
a memory map of a sound memory area corresponding to a
buffer memory shown in Fig. 1;
Fig. 11 is a flow chart showing schematic operations
of a game;
Fig. 12 is a flow chart showing the details of a
subroutine of audio output processing shown in Fig. 11;
Fig. 13 is a timing chart showing output of audio
data in a case where there is no amount of delay;
Fig. 14 is a timing chart showing output of audio
data in a case where there is an amount of delay,
and the previous amount of delay and the current
amount of delay are the same; and
Fig. 15 is a timing chart showing output of audio
data in a case where there is an amount of delay,
and the previous amount of delay and the current
amount of delay are not the same.

DESCRIPTION OF THE PREFERRED EMBODIMENTS

Fig. 1 is a block diagram showing the construction
of a sound generator according to a first embodiment of
the present invention. In Fig. 1, an image/audio
processor (hereinafter referred to as a "processor") 10
is, for example, a video game set for generating images
and sounds such as music and a sound effect for a game,
and comprises an image processing unit 11 and an audio
processing unit 12. An image memory 13 is connected
to the image processing unit 11 through an address bus
and a data bus. Further, an external memory 20 and a
controller 30 are detachably connected to the image
processing unit 11.

The image processing unit 11 performs an operation
for image processing on the basis of data representing
the operating state inputted from the controller 30, and
image data and program data which are stored in an
image data/program data memory 21 in the external
memory 20, and feeds image display data generated by
the operation to an image signal generation circuit 14.
Specifically, on the basis of programs, the image
processing unit 11 generates image display data for
displaying, in a three-dimensional manner as the line of
sight of a camera is moved, one or a plurality of objects
which generate sounds such as music and/or a sound
effect of the game (for example, a waterfall, a river, an
animal, an automobile, an airplane, or the like) and an
object which does not generate sounds (for example, a
building, a plant, a road, a cloud, a scene, or the like).
The camera is a virtual camera which is not displayed on
a screen of a display device. An image projected on the
virtual camera is displayed on the screen of the display
device. Further, the line of sight of the camera is moved
by the progress of the game, the operation of a player,
and the like. When a hero character (for example, a
human being or an animal) which changes in the direction
of movement or the action (the movement of the hands
and legs) according to the operation of the player is
displayed on the screen, the line of sight of the camera
may, in some cases, be moved in synchronization with the
movement of the line of sight of the hero character.

The image signal generation circuit 14 generates,
on the basis of the image display data fed from the
image processing unit 11, an image signal with various
synchronizing signals required to display an image by a
CRT (Cathode Ray Tube) or a standard television receiver
(hereinafter referred to as a "television") 40, which is
one example of the display device, and feeds the image
signal to the television 40, to display an image on
a display screen (or a display section) 41.

Furthermore, the image processing unit 11 feeds to
the audio processing unit 12 coordinate data (hereinaf- 
ter referred to as "first coordinate data") of an object 
which generates sounds (hereinafter referred to as a 
"sound generating object") out of various display 
objects, coordinate data (hereinafter referred to as "sec- 
ond coordinate data") of the virtual camera (or the hero 
character), and data for designating the type of sound 
for the purpose of changing a sound effect such as the 
sound of water such as a waterfall or a river, the sound 
in running of an automobile, or the cry of an animal as 
the line of sight of the virtual camera is moved, that is, 
for the purpose of obtaining three-dimensional sound 
effects in synchronization with three-dimensional image 
display. The coordinate data fed to the audio processing 
unit 12 also includes Z coordinate data representing the
depth direction in addition to X coordinate data repre- 
senting the transverse direction of the screen and Y 
coordinate data representing the longitudinal direction 
of the screen. 

A sound source data memory 22 included in the 
external memory 20 is detachably connected to the 
audio processing unit 12, and a buffer memory for audio
processing (hereinafter referred to as a "buffer mem- 
ory") 15 for temporarily storing sound source data is 
connected thereto through a data bus and an address 
bus. 

The sound source data memory 22 stores a large 
amount of sound source data used for one game exe- 
cuted by programs in the external memory 20 in the 
form of PCM data or AD-PCM data. 

The buffer memory 15 includes a buffer area 15a
and a delay buffer area 15b (see Fig. 9 showing the
embodiment as described in detail later). The buffer
area 15a, which is hereinafter referred to as a non-delay
buffer area, temporarily stores audio data in order to
generate a first audio signal which is not delayed in
given unit time. The delay buffer area 15b has a storage
capacity corresponding to maximum delay time in order
to generate a second audio signal which is delayed
from the first audio signal by time corresponding to
delay time calculated on the basis of the distance and
the direction between the coordinates of the sound
generating object and the coordinates of the virtual
camera (or the hero character).
In the present embodiment, if the maximum delay
time is not less than one-fifth of the unit time, an
unnatural impression is given when right and left sounds
are heard, whereby the storage capacity of the delay
buffer area 15b is selected as a storage capacity which
is one-fifth of that of the non-delay buffer area 15a. For
example, the non-delay buffer area 15a has 320 bytes as a
storage capacity corresponding to the unit time, and the
delay buffer area 15b has 64 bytes as a storage capacity
corresponding to the maximum delay time. The unit
processing time is determined by the relationship
between a sampling frequency in a case where audio
signals are sampled to produce audio data (for example,
32 kHz) and time corresponding to audio signals to
be processed at one time (that is, time corresponding to
the processing unit), and is set to 1/200 to 1/240
seconds in the present embodiment. The delay time of the
second audio signal from the first audio signal is
variably controlled depending on a direction to the sound
generating object as viewed from the camera (or the
hero character) or, in a case where the camera (or the
hero character) is moved, depending on an amount of
change between the directions to the sound generating
object as viewed from the camera before and after the
movement (or angles based on the directions).
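
The buffer sizes and the unit time quoted above can be cross-checked with a short calculation. The following is only a sketch: it assumes 16-bit (2-byte) monaural PCM samples, which the text does not state, and the constant and function names are hypothetical. Under that assumption, 320 bytes at 32 kHz corresponds to 1/200 second, and 64 bytes to one-fifth of that.

```python
# Sketch under stated assumptions; names are illustrative, not from the source.
SAMPLE_RATE_HZ = 32_000        # sampling frequency given in the text
BYTES_PER_SAMPLE = 2           # ASSUMPTION: 16-bit PCM (not stated in the text)
NON_DELAY_BUFFER_BYTES = 320   # capacity of the non-delay buffer area 15a
DELAY_BUFFER_BYTES = 64        # capacity of the delay buffer area 15b


def buffer_duration_s(num_bytes, sample_rate_hz=SAMPLE_RATE_HZ,
                      bytes_per_sample=BYTES_PER_SAMPLE):
    """Return the playback time, in seconds, held by a buffer of num_bytes."""
    num_samples = num_bytes // bytes_per_sample
    return num_samples / sample_rate_hz


unit_time_s = buffer_duration_s(NON_DELAY_BUFFER_BYTES)   # 160 samples
max_delay_s = buffer_duration_s(DELAY_BUFFER_BYTES)       # 32 samples
```

With a different sample width the durations change, so the 1/200 second result should be read as consistent with the stated range rather than as confirmation of the sample format.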
When the capacity of the buffer memory 15 is large,
a delay buffer area 15b' having a storage capacity which
is the sum of the storage capacity of the non-delay
buffer area 15a and the storage capacity of the delay
buffer area 15b may be used in place of the delay buffer
area 15b, and a start address to which second sound
source data is to be written may be variably controlled
in the range of 0 to 64 bytes depending on the delay time
(see Fig. 10 showing the embodiment as described in
detail later).
The audio processing unit 12 executes operation
processing on the basis of predetermined programs in
accordance with the first and second coordinate data,
finds a direction (or an angle) to the sound generating
object as viewed from the camera (or the hero
character), and determines on the basis of the direction
an amount of delay (or delay time) so as to have a
relationship shown in a characteristic view of Fig. 6 as
described later.
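
One way to find such a direction from the first and second coordinate data is sketched below. It is an illustration only: the function name, the use of a yaw angle for the camera heading, and the choice to ignore the Y coordinate are assumptions, not details from the source. The angle is measured as in the later discussion of Figs. 2 and 3, with 0 degrees on the left side, 90 degrees at the front, and 180 degrees on the right side; objects behind the camera are folded onto the front semicircle.

```python
import math


def direction_angle_deg(obj_xz, cam_xz, cam_yaw_deg=0.0):
    """Angle to the object as seen from the camera, in degrees.

    0 = directly left, 90 = directly in front, 180 = directly right.
    ASSUMPTIONS: only the X (transverse) and Z (depth) coordinates are used;
    cam_yaw_deg = 0 means the camera looks along +Z, and a positive yaw
    turns the camera toward +X.
    """
    dx = obj_xz[0] - cam_xz[0]
    dz = obj_xz[1] - cam_xz[1]
    yaw = math.radians(cam_yaw_deg)
    # Project the world-space offset onto the camera's right and forward axes.
    local_x = dx * math.cos(yaw) - dz * math.sin(yaw)
    local_z = dx * math.sin(yaw) + dz * math.cos(yaw)
    # atan2 against the "left" axis (-X); abs() mirrors rear objects forward.
    return abs(math.degrees(math.atan2(local_z, -local_x)))
```

For example, with the camera at the origin looking along +Z, an object at (-5, 0) gives 0 degrees and an object at (0, 5) gives 90 degrees.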

Furthermore, the audio processing unit 12 finds the
distance between the sound generating object and the
camera (or the hero character) on the basis of the first
and second coordinate data, and determines the sound
volume on the basis of data representing the distance.
The audio processing unit 12 reads out audio data
corresponding to the processing unit (320 bytes) out of
the audio data stored in the sound source data memory 22
in a predetermined period and writes the audio data into
the non-delay buffer area 15a, finds an amount of
change (an angle) between directions to the sound
generating object as viewed from the camera (or the hero
character) on the basis of coordinate data respectively
representing the position of the camera (or the hero
character) and the positions of the sound generating
object before and after the processing unit time, and
finds delay time corresponding to the amount of change
between the directions by operation processing. The
delay time may be determined by previously setting
data representing delay time in a table for each angle
based on a direction and reading out corresponding
data representing delay time from the table in place of
the operation processing. A write area of the delay
buffer area 15b is determined depending on the delay
time thus determined.
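
The table-based alternative can be sketched as follows. The table contents are purely hypothetical: the actual characteristic is the one shown in Fig. 6, which is not reproduced here, so this sketch simply assumes zero delay at the front (90 degrees) rising linearly to the 64-byte maximum at either side. The 15-degree step and all names are likewise assumptions.

```python
MAX_DELAY_BYTES = 64    # capacity of the delay buffer area 15b
ANGLE_STEP_DEG = 15     # ASSUMPTION: table resolution, not from the source

# HYPOTHETICAL characteristic: linear in the angular offset from the front.
DELAY_TABLE = [
    round(MAX_DELAY_BYTES * abs(angle - 90) / 90)
    for angle in range(0, 181, ANGLE_STEP_DEG)
]


def delay_for_angle(angle_deg):
    """Look up the delay (in bytes) for a direction angle of 0 to 180 degrees."""
    clamped = min(max(angle_deg, 0.0), 180.0)
    index = round(clamped / ANGLE_STEP_DEG)
    return DELAY_TABLE[index]
```

A table trades a small amount of memory for the per-unit-time calculation, which is the substitution the paragraph above describes.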

The audio processing unit 12 reads out the sound 
source data stored in the non-delay buffer area 15a and 
outputs the sound source data as a first sound source 
data, writes the first sound source data into addresses 
of the delay buffer area 15b corresponding to the delay 
time, and reads out the sound source data from the final 
address of the delay buffer area 15b and outputs the
sound source data as a second sound source data 
which is delayed from the first sound source data by 
desired delay time. 
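
The effect of the two buffer areas can be illustrated with a simple delay line. This is a minimal sketch operating on lists of samples rather than on byte addresses, and it is not the memory layout of Figs. 9 and 10; the class and method names are hypothetical. It also makes no attempt to smooth the output when the delay changes from one unit time to the next.

```python
class DelayedChannel:
    """Produce a non-delayed and a delayed copy of each block of samples."""

    def __init__(self, max_delay):
        self.max_delay = max_delay
        # Tail of the previously processed samples; silence at start-up.
        self.history = [0] * max_delay

    def process(self, block, delay):
        """Return (first, second); second lags first by `delay` samples."""
        if not 0 <= delay <= self.max_delay:
            raise ValueError("delay must be between 0 and max_delay")
        first = list(block)
        timeline = self.history + first
        start = self.max_delay - delay
        second = timeline[start:start + len(first)]
        # Keep only the newest max_delay samples for the next block.
        self.history = timeline[-self.max_delay:] if self.max_delay else []
        return first, second
```

Calling `process` once per unit time with a constant delay yields a second stream that is a continuous, shifted copy of the first across block boundaries.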

The first audio signal and the second audio signal
are not fixedly assigned to the left and right
speakers (or left and right sound generating bodies of a
headphone), respectively. The first audio signal
corresponds to a channel of audio signals which are not
delayed, and the second audio signal corresponds to a
channel of audio signals which are delayed. The audio
processing unit 12
feeds the first audio data to a digital-to-analog conver- 
sion circuit for left 16a and feeds the second audio data 
to a digital-to-analog conversion circuit for right 16b 
when it judges that the sound generating object exists 
on the left side of the front of the camera (or the hero 
character). On the contrary, the audio processing unit 
12 feeds the first audio data to the digital-to-analog con- 
version circuit for right 16b and feeds the second audio 
data to the digital-to-analog conversion circuit for left 
16a when it judges that the sound generating object 
exists on the right side of the front of the camera (or the 
hero character). Specifically, the audio processing unit 
12 switches a combination of the digital-to-analog con- 
version circuits 16a and 16b (or 16b and 16a) for feed- 
ing the first and second audio data for each unit time 
depending on whether the sound generating object 
exists on the right side or the left side as viewed from 
the camera (or the hero character). 
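
The switching described above amounts to a simple routing decision made once per unit time. The sketch below is illustrative only; the function name and the boolean parameter are assumptions.

```python
def route_channels(first, second, object_on_left):
    """Return (left_output, right_output) for one unit time.

    `first` is the non-delayed data and `second` is the delayed data.
    The non-delayed data goes to the side on which the sound generating
    object lies, so the nearer ear hears the sound first.
    """
    if object_on_left:
        return first, second
    return second, first
```

For example, `route_channels(first, second, object_on_left=True)` sends the non-delayed data to the left converter 16a and the delayed data to the right converter 16b.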

The digital-to-analog conversion circuits 16a and
16b subject the inputted audio data to digital-to-analog
conversion, to generate audio signals, and feed the
audio signals to corresponding filters 17a and 17b. The
filters 17a and 17b respectively subject the left and
right audio signals to interpolation processing, to
waveform-shape the audio signals into smooth audio
signals, and feed the wave-shaped audio signals to left
and right speakers 42L and 42R provided in relation to
the television 40 and/or feed the audio signals to a
headphone 44 through an earphone jack 43, thereby to
generate sounds.

The audio processing unit 12 further includes a
sound volume control section 12a. The sound volume
control section 12a finds the distance between the
sound generating object and the camera (or the hero
character), and controls the sound volume depending
on the found distance. For example, the sound volume
control section 12a controls the sound volume so that the
sound volume is increased if the camera (or the hero
character) approaches the sound generating object to
decrease the distance therebetween, while being
decreased if the camera (or the hero character) moves
farther apart from the sound generating object to
increase the distance therebetween. Specifically, when
the sound volume is so controlled as to be inversely
proportional to the square of the distance between the
camera (or the hero character) and the sound generating
object, it is possible to change the presence of a
sound effect in correspondence to the change in
three-dimensional image display with the movement of the
camera (or the hero character).
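
The inverse-square relationship can be written out as a small function. This is a sketch under assumptions: the reference distance, the normalized maximum volume of 1.0, and the clamping inside the reference distance are illustrative choices, not values from the source.

```python
def volume_for_distance(distance, reference_distance=1.0, max_volume=1.0):
    """Volume inversely proportional to the square of the distance.

    ASSUMPTION: at or inside reference_distance the volume is held at
    max_volume, which also avoids dividing by a distance of zero.
    """
    if distance <= reference_distance:
        return max_volume
    return max_volume * (reference_distance / distance) ** 2
```

Doubling the distance from the reference distance reduces the volume to one quarter, and quadrupling it reduces the volume to one sixteenth.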

The above-mentioned operations are repeatedly
performed for each unit time, whereby a sound effect for
a game or BGM music (game music) for raising the
atmosphere of the game is generated in synchronization
with the change of an image. The audio processing
unit 12 controls a difference between timing at which the
first audio data is outputted and timing at which the
second audio data is outputted (an amount of delay of one
of the audio data from the other audio data) on the basis
of a direction or an angle to the sound generating object
as viewed from the camera (or the hero character), and
controls the sound volume on the basis of the distance
between the position of the camera (or the hero
character) and the position of the sound generating
object. As a result, audios or sounds heard from the left
and right speakers 42L and 42R or the headphone 44 are
also changed in a three-dimensional manner in
synchronization with the change in three-dimensional
image display corresponding to the movement of the line
of sight to the sound generating object as viewed from
the camera (or the hero character).

More preferably, when the amounts of delay of right
and left sounds are controlled in order to represent
three-dimensional sound effects more realistically, the
sound volume may be controlled not in the maximum range
from the minimum sound volume to the maximum sound
volume but in a range obtained by restricting or
suppressing the maximum range. Further, when a sound
insulating object for insulating sounds (for example, a
building such as a house or a wall, or a large moving
object such as a ship or an airplane) comes to exist
between the camera (or the hero character) and the sound
generating object upon movement of the camera (or the
hero character) or arrival of the sound insulating
object, sound effects further conforming to reality are
obtained if the sound volume control section 12a so
controls the sound volume that its level becomes
extremely low.

With reference to the block diagram of Fig. 1, the
relationship between the delay time and the change in
the sound volume depending on the presence or absence of
the delays of the first and second audio signals will be
specifically described. Operations in this case are
achieved by the execution of programs stored in the image
data/program data memory 21 by the image processing
unit 11 and the audio processing unit 12. For that
purpose, it is assumed that programs for carrying out
control as shown in characteristic views of Figs. 3 to 6
are previously stored in the memory 21. Further, a mode
in which the sound volume is controlled under the
condition that there is no amount of delay (zero delay)
(see Fig. 3), a mode in which the sound volume is
controlled under the condition that there is an amount of
delay (see Fig. 4), a mode in which the sound volume is
controlled depending on the distance between the sound
generating object and the camera (or the hero character)
in a case where there is an amount of delay (see Fig. 5),
and a mode in which the amount of delay is controlled by
the positional relationship (the direction) between the
sound generating object and the camera (or the hero
character) (see Fig. 6) are executed individually or in a
suitable combination on the basis of the programs.

A specific method for realizing control of the sound
volume and/or control of the amount of delay as shown
in the characteristic views of Figs. 3 to 6 may be a
method of expressing the relationship between the
distance/direction and the sound volume as an equation
so as to give the characteristic waveforms shown in
Figs. 3 to 6, storing the equation, and calculating the
distance/direction and/or the sound volume using the
stored equation for each control, or a method of setting
a sound volume value of the waveform in a table for each
left and right unit distance centered around the position
of the camera (or the hero character) and reading out the
sound volume value using data representing the
distance as an address.

Referring now to Figs. 2 and 3, description is made
of the mode in which the sound volume is controlled
under the condition that there is no amount of delay. For
example, when the distance between the sound generating
object and the camera (or the hero character) is
constant in a game scene where it is preferable not to
cause respective amounts of delay of left and right
audio signals, the sound volume of the left audio signal
is set to the maximum amount of change and the sound
volume of the right audio signal is set to zero when the
sound generating object exists on the left side (at an
angle of 0°) as viewed from the camera (or the hero
character) (see Fig. 3). As the sound generating object
is so moved rightward as to draw a semicircle spaced
apart by a predetermined distance r around the camera
(or the hero character) as shown in Fig. 2, the sound
volume of the right audio signal is gradually increased
and the sound volume of the left audio signal is
gradually decreased, as indicated by the characteristic
view of Fig. 3. When the sound generating object reaches
the front of the camera (or the hero character) (a
position at an angle of 90° from the left side), the
sound volumes of the left and right audio signals are
made equal to each other. Further, when the sound
generating object is moved rightward to reach a position
on the right side of the camera (or the hero character)
(a position at an angle of 180° from the left side), the
sound volume of the left audio signal is set to zero and
the sound volume of the right audio signal is set to the
maximum amount of change.
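
The three reference points of this mode (left, front, right) can be reproduced by a simple cross-fade. The exact curve is the one shown in Fig. 3; the linear interpolation between the reference points used here is an assumption, as are the function name and the normalized maximum volume.

```python
def pan_volumes_no_delay(angle_deg, max_volume=1.0):
    """Return (left_volume, right_volume) for the zero-delay mode.

    0 degrees = object on the left, 90 = front, 180 = right.
    ASSUMPTION: volumes vary linearly with the angle.
    """
    t = min(max(angle_deg, 0.0), 180.0) / 180.0
    left = max_volume * (1.0 - t)
    right = max_volume * t
    return left, right
```

At 0 degrees this gives the full volume on the left and zero on the right, at 90 degrees both channels are equal, and at 180 degrees the roles are reversed.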

Even in a case where the sound generating object 
is fixed, and a direction of the camera (or the hero char- 
acter) is changed, if the relative positional relationship 
between the sound generating object and the camera 
(or the hero character) is the same as the relationship 
shown in Figs. 2 and 3, the sound volumes of the left 
and right audio signals may be similarly controlled. 
When the sound generating object is fixed, and the 
direction of the camera (or the hero character) is 
changed, the same applies to the case shown in the 
characteristic views of Figs. 4 to 6. It is preferable that
the increase or decrease of the sound volumes is
controlled by multiplying the change in characteristics
shown in Fig. 3 by a correction value so that the sound
volumes are inversely proportional to a value obtained
by multiplying the distance between the sound generating
object and the camera (or the hero character) by a given
coefficient.

Referring now to Figs. 2 and 4, description is made
of the mode in which the sound volume is controlled
under the condition that there is an amount of delay.
In a game scene where the sound volume is controlled,
combined with three-dimensional display, that is, a
game scene where the sound volume is controlled in 
relation to amounts of delay of the left and right audio 
signals, when the sound generating object is on the left 
side as viewed from the camera (or the hero character), 
the sound volume of the left audio signal is set to the 
maximum amount of change and the sound volume of 
the right audio signal is set to approximately one-half of 
the maximum amount of change (see Fig. 4). The reason
why the minimum sound volume is set not to zero but to
one-half of the maximum sound volume (or the maximum
amount of change) in a case where there is a delay is
that, because the left and right audio signals are
delayed relative to each other, three-dimensional sound
effects are obtained even if the sound volumes of the
left and right audio signals are not made to differ from
each other over the maximum range. The sound volume of
the right audio signal is
gradually increased and the sound volume of the left 
audio signal is gradually decreased, as shown in Fig. 4, 
as the sound generating object is so moved rightward 
as to draw a semicircle spaced apart by a predeter- 
mined distance r around the camera (or the hero char- 
acter) as shown in Fig. 2. When the sound generating 
object reaches the front of the camera (or the hero character),
the sound volumes of the left and right audio signals
are made equal to each other. Further, when the
sound generating object is moved rightward to reach a 
position on the right side of the camera (or the hero 
character), the sound volume of the right audio signal is 
set to the maximum amount of change and the sound 
volume of the left audio signal is set to one-half of the 
maximum amount of change. 
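
As an illustration only, the characteristic described for Fig. 4 might be sketched in Python as follows. The linear interpolation between the end points, the angle convention (-90 degrees for the left side, 0 for the front, +90 for the right side), and the name `pan_gains_with_delay` are assumptions; the text fixes only the values at the left side, the front, and the right side.

```python
def pan_gains_with_delay(angle_deg, max_gain=1.0):
    """Return (left, right) gains for a sound generating object.

    angle_deg: assumed convention, -90 = left side, 0 = front, +90 = right side.
    The louder channel reaches max_gain and the quieter channel is held at
    one-half of max_gain, because the delay already provides localization.
    """
    a = max(-90.0, min(90.0, angle_deg))   # clamp to the semicircle of Fig. 2
    t = (a + 90.0) / 180.0                 # 0.0 at far left, 1.0 at far right
    half = 0.5 * max_gain
    left = max_gain - t * (max_gain - half)   # falls from max to one-half
    right = half + t * (max_gain - half)      # rises from one-half to max
    return left, right
```

With these assumptions the two gains are equal when the object is at the front, which matches the behaviour described above.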

Referring now to Figs. 2 and 5, description is made 
of the relationship between the distance and the sound 
volume in the mode in which the sound volume is con- 
trolled under the condition that there is an amount of 
delay as shown in Fig. 4. When the sound generating 
object exists within the range of a radius r from the cam- 
era (or the hero character), the sound volume is 
changed in the range between the maximum amount of 
change and the minimum amount of change depending 
on the direction or the position of the sound generating 
object as viewed from the camera (or the hero charac- 
ter). The reason for this is that if the sound generating 
object is positioned within the range of a given short dis- 
tance r, the sound volume hardly changes with the 
change in the distance. In Fig. 2, when the sound gen- 
erating object exists at the front of the camera (or the 
hero character), the sound volumes of the left and right 
audio signals take values intermediate between the 
maximum amount of change and the minimum amount 
of change. On the other hand, as the sound generating 
object moves farther apart in the radial direction, the 
sound volume is so changed as to be exponentially 
decreased. When the sound generating object moves 
farther apart from the camera (or the hero character) by 
not less than a predetermined distance, the sound vol- 
ume is set to zero. 
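
A minimal sketch of the distance characteristic described for Fig. 5 is given below. The numerical values of the radius `r`, the cutoff distance, and the decay constant are hypothetical, since the text gives none of them; only the three regions (no change within r, exponential decrease beyond r, zero beyond a predetermined distance) come from the description.

```python
import math

def distance_gain(distance, r=1.0, cutoff=10.0, decay=0.5):
    """Scale factor applied to the direction-dependent volume.

    r, cutoff and decay are illustrative assumptions, not values from the text.
    """
    if distance <= r:
        return 1.0                      # within radius r: distance is ignored
    if distance >= cutoff:
        return 0.0                      # not less than the cutoff: silent
    return math.exp(-decay * (distance - r))   # exponential decrease
```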

Referring now to Figs. 2 and 6, description is made 
of the mode in which the amount of delay is controlled in 
relation to the positional relationship between the sound 
generating object and the camera (or the hero charac- 
ter). When the camera (or the hero character) faces the 
front, and the sound generating object exists at the 
front, it is necessary that there is no amount of delay 
between the left audio signal and the right audio signal. 
If the amount of delay is so controlled as to be changed 
even when the camera (or the hero character) is only 
slightly moved leftward or rightward, an uncomfortable 
impression is given in relation to image display. As 
shown in Fig. 6, therefore, the amount of delay is so 
controlled as not to be changed in the range of predeter- 
mined distances on the left and right sides from the 
camera (or the hero character). Specifically, when the 
sound generating object is on the left side as viewed 
from the camera (or the hero character), the amount of 
delay of the left audio signal is set to zero and the 
amount of delay of the right audio signal is set to the 
maximum amount of delay. The amount of delay of the 
right audio signal is decreased as the sound generating 
object is so moved rightward as to draw a semicircle 
spaced apart by a predetermined distance r around the 
camera (or the hero character), the amounts of delay of
the left and right audio signals are set to zero in the
range of predetermined distances on the left and right
sides from the center, and the amount of delay of the left
audio signal is gradually increased with the amount of
delay of the right audio signal set to zero as the sound
generating object is moved rightward from a position
spaced apart from the center by a predetermined distance.
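
The delay characteristic described for Fig. 6 might be sketched as follows. This is an assumption-laden illustration: the dead zone is expressed here as an angle rather than as a distance from the center, the change outside the dead zone is taken to be linear, and the default values of `max_delay_ms` and `dead_zone_deg` are hypothetical.

```python
def channel_delays(angle_deg, max_delay_ms=1.0, dead_zone_deg=10.0):
    """Return (left_delay_ms, right_delay_ms) for a sound generating object.

    angle_deg: assumed convention, -90 = left side, 0 = front, +90 = right side.
    Inside the dead zone around the front, neither channel is delayed.
    """
    a = max(-90.0, min(90.0, angle_deg))
    if abs(a) <= dead_zone_deg:
        return 0.0, 0.0
    amount = max_delay_ms * (abs(a) - dead_zone_deg) / (90.0 - dead_zone_deg)
    if a < 0:
        return 0.0, amount   # object on the left: the right signal is delayed
    return amount, 0.0       # object on the right: the left signal is delayed
```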

Fig. 7 is a block diagram showing the construction
of a sound generator according to a second embodiment
of the present invention. The sound generator
according to the present embodiment differs from the
sound generator according to the first embodiment (see
Fig. 1) in the following points. First, the main body of the
processor 10 is replaced with a video game set 50.
Further, the image processing unit 11 is constituted by a
main CPU (M-CPU) 51 and two RISC CPUs (R-CPUs) 52
and 53. The image memory 13 and the buffer memory
15 are constituted by a working RAM (W-RAM) 55
having a large storage capacity. Further, the audio
processing unit 12 is constituted by one R-CPU 53. That is,
the R-CPU 53 is used for both image processing and audio
processing. As described in the foregoing, the image
processing unit 11 is constituted by the three CPUs (the
M-CPU 51, the R-CPU 52, and the R-CPU 53). The
reason why the audio processing unit 12 is constituted by
one R-CPU 53 is that the audio processing can be
performed in a shorter time period than the image
processing. The reason why the image memory 13 and the
buffer memory 15 are constituted by one W-RAM 55
having a large capacity (for example, 4 megabytes) is
that the degree of freedom of the assignment of a
memory space is increased so that the distribution of time
periods used for the image processing and the audio
processing can be set flexibly depending on the
purpose of use.

In order to control input/output of a plurality of
controllers 30, a controller control circuit 56 is provided.
Further, an input/output control circuit (I/O) 57 is provided in
order to control data transfer or input/output between
the M-CPU 51, the R-CPU 52, the R-CPU 53 and the
W-RAM 55 and an external memory 20, the controller
control circuit 56 and the like. Further, a connector for a
cartridge 581 is provided in order to detachably mount
the external memory 20, connectors for a controller 582
and 583 are provided in order to detachably connect the
controllers 30 to the controller control circuit 56,
connectors for audio 584 and 585 are provided in order to
connect filters 17a and 17b to speakers 42L and 42R or a
headphone 44 of a television 40, and a connector for an
image signal 586 is provided in order to connect an
image signal generation circuit 14 to a display 41. In the
following description, the various types of connectors
581 to 586 are merely referred to as "connectors". Since
the remaining construction is the same as that shown in
Fig. 1, the same sections are assigned the same
reference numerals, and the detailed description thereof
is omitted.

As the external memory 20 serving as an information
storage medium, the ROM cartridge may be
replaced with various recording or storage media such
as a CD-ROM optically storing data, a magnetic disk
magnetically storing data, and a magneto-optic disk. In
that case, a recording or reproducing device
corresponding to the type of information storage medium
must be provided to read out a certain amount of data
from the information storage medium and temporarily
store the data in a memory space in a part of the
W-RAM 55 in the video game set 50.

Fig. 8 is a diagram illustrating the memory space of 
the W-RAM 55. In the W-RAM 55, the memory space is 
assigned on the basis of programs previously stored in 
an image/program memory 21 in the external memory 
20. One example will be described as follows. The W- 
RAM 55 includes a program area 55a to and in which 
parts of the programs stored in the image/program 
memory 21 are transferred and stored, a frame memory 
area 55b storing image data corresponding to one 
frame (corresponding to an image frame), a Z buffer 
area 55c storing depth coordinate data for each object 
or character, an image data area 55d, a sound memory 
area 15 for audio data processing, a control pad data 
storage area 55e, and a working memory area 55f. 

The sound memory area 15 corresponds to the
buffer memory shown in Fig. 1, and includes a
non-delay buffer area 15a, a delay buffer area 15b, and an
object coordinate data storage area 15c, as illustrated in
Fig. 8. The non-delay buffer area 15a and the delay
buffer area 15b have memory structures as specifically
shown in Fig. 9, and data are written/read out as shown.
Specifically, when audio data for each processing unit
are written to/read out of the non-delay buffer area 15a,
data previously written are shifted to the succeeding
address one byte at a time every time audio data
composed of one byte are written to the first address. That
is, audio data are written in a first-in first-out
manner, and are read out and outputted as audio data
(read data) on a first channel (CH1) from the final
address. The audio data on the first channel read out
from the non-delay buffer area 15a are written as they
are to the first address of the delay buffer area 15b. At
this time, audio data at an address designated by an
address register (an internal register included in the
R-CPU 53) 15d storing a read address of the delay buffer
area 15b are read out and outputted as audio data on a
second channel (CH2). The number of addresses
(bytes) from the first address of the delay buffer area
15b to the read address designated by the address
register 15d determines the delay time of sounds on the
second channel from sounds on the first channel. In this
case, a capacity corresponding to the maximum
delay time is sufficient as the storage capacity of the
delay buffer area 15b, although a read control program
in the R-CPU 53 becomes complicated. On the other
hand, when the sounds on the second channel need not
be delayed, as compared with the sounds on the first
channel, the read address designated by the address
register 15d may be set to the same first address as a
write address of the delay buffer area 15b. Therefore,
the delay time can be changed in a wide range.
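
The write/read scheme of Fig. 9 can be modelled, purely as a sketch, with two shift registers. The class name `Fig9Buffers`, the use of Python deques in place of byte-addressed RAM, and the expression of the delay in samples are assumptions made for illustration; `read_address` stands in for the address register 15d.

```python
from collections import deque

class Fig9Buffers:
    """Toy model of the non-delay buffer 15a and the delay buffer 15b."""

    def __init__(self, non_delay_len, max_delay):
        # Index 0 is the first address, index -1 the final address.
        self.non_delay = deque([0] * non_delay_len, maxlen=non_delay_len)
        self.delay = deque([0] * (max_delay + 1), maxlen=max_delay + 1)
        self.read_address = 0          # plays the role of register 15d

    def set_delay(self, samples):
        if not 0 <= samples < len(self.delay):
            raise ValueError("delay exceeds the delay buffer capacity")
        self.read_address = samples

    def push(self, sample):
        """Write one sample and return the (CH1, CH2) outputs."""
        ch1 = self.non_delay[-1]            # read from the final address
        self.non_delay.appendleft(sample)   # write at first address, shift rest
        self.delay.appendleft(ch1)          # CH1 data go to the delay buffer
        ch2 = self.delay[self.read_address] # read at the designated address
        return ch1, ch2
```

Setting `read_address` to 0 makes the second channel identical to the first, which corresponds to the no-delay case mentioned in the text.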

Although the embodiment shown in Fig. 9 has been
described for a case where the sound
memory area (or the buffer memory for audio
processing) 15 is efficiently used in the range of the minimum
storage capacity, a memory structure as shown in Fig.
10 may be used if a sound memory area 15 having a
large storage capacity can be prepared. In Fig. 10, the
storage capacity of a delay buffer area 15b' is set to a
capacity which is the sum of the storage capacities of a
non-delay buffer area 15a and a delay buffer area 15b
(15b' = 15a + 15b). Audio data on the first channel
(CH1) are written and read out in the same manner as
that shown in Fig. 9, while audio data on the second
channel (CH2) are written and read out in the following
manner. Specifically, a value of a write address
corresponding to delay time is written into an address
register 15e contained in the R-CPU 53 by the R-CPU 53.
The same audio data as those on the first channel are
simultaneously written at addresses designated by the
address register 15e. The audio data are read out from
the delay buffer area 15b' (sounds on the second
channel are generated), starting at the final address of the
delay buffer area 15b'. Consequently, timing at which
the audio data on the second channel are read out is
delayed from timing at which the same audio data on
the first channel are read out (sounds on the first
channel are generated) by a time period proportional to the
number of addresses obtained by subtracting the value
of the address in the address register 15e from the
number of addresses corresponding to the delay buffer
area 15b.
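
The alternative structure of Fig. 10 might be sketched as follows, again only as an illustration. The class name `Fig10DelayBuffer`, the list standing in for RAM, the zero fill of unused cells, and the mapping "write address = capacity of 15b minus the delay in samples" are assumptions; `write_address` stands in for the address register 15e. The sketch assumes a constant delay and a non-delay buffer of at least one cell.

```python
class Fig10DelayBuffer:
    """Toy model of the enlarged delay buffer 15b' (size of 15a plus 15b)."""

    def __init__(self, non_delay_len, max_delay):
        self.max_delay = max_delay                       # capacity of 15b
        self.cells = [0] * (non_delay_len + max_delay)   # 15b' = 15a + 15b
        self.write_address = max_delay                   # register 15e, no delay

    def set_delay(self, samples):
        if not 0 <= samples <= self.max_delay:
            raise ValueError("delay exceeds the maximum delay time")
        # A larger delay means writing closer to the first address.
        self.write_address = self.max_delay - samples

    def push(self, sample):
        """Write one sample and return the CH2 output."""
        ch2 = self.cells[-1]                    # read from the final address
        self.cells = [0] + self.cells[:-1]      # shift toward the final address
        self.cells[self.write_address] = sample # write at the designated address
        return ch2
```

Under these assumptions a sample written with no delay takes as many steps to reach the final address as it would in the non-delay buffer 15a, so the extra latency equals the capacity of 15b minus the write address.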

On the other hand, the coordinate data storage
area 15c is an area storing coordinate data of a sound
generating object or the like which is displayed on a
screen. For example, the coordinate data storage area
15c stores coordinate data of an object 1 generating
sounds such as an enemy character or a waterfall as
coordinate data of the object 1. The coordinate data
storage area 15c also stores coordinate data of an object 2
such as a camera (or a hero character) whose line of
sight is moved to see the object 1 by an operator
operating the controllers 30 as coordinate data of the object
2. When sounds are generated from the object 1, the
M-CPU 51 calculates a direction to the object 1 as viewed
from the object 2 and the distance therebetween on the
basis of the coordinate data of the object 1 and the
coordinate data of the object 2. Further, a program
previously so set as to produce three-dimensional sound
effects most suitable for three-dimensional image
display out of the characteristic views of Figs. 3 to 6 is
executed on the basis of data representing the direction
and the distance, to generate data representing the
delay time, the sound volume and the type of sound,
and to feed these data to the R-CPU 53.
The R-CPU 53 carries out writing/reading control
described while referring to Fig. 9 or 10, to control the
delay time of the sounds generated from each of the
first channel and the second channel and control the
sound volume thereof.
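
The calculation of the direction and the distance from the two sets of coordinate data might look like the following sketch. The text does not specify a coordinate system, so the two-dimensional horizontal plane, the convention that a yaw of 0 degrees faces +z with +x to the right, and the function name `direction_and_distance` are all assumptions.

```python
import math

def direction_and_distance(listener_xz, listener_yaw_deg, source_xz):
    """Return (relative_angle_deg, distance) of object 1 as seen from object 2.

    relative_angle_deg is negative when the source is to the left and
    positive when it is to the right, within [-180, 180).
    """
    dx = source_xz[0] - listener_xz[0]
    dz = source_xz[1] - listener_xz[1]
    distance = math.hypot(dx, dz)
    bearing = math.degrees(math.atan2(dx, dz))   # 0 = +z, positive toward +x
    relative = (bearing - listener_yaw_deg + 180.0) % 360.0 - 180.0
    return relative, distance
```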

Referring to Fig. 11, description is now made of the
schematic flow of the game. When a power switch of the
game set 50 is turned on, the following operations are
performed by the M-CPU 51 and/or the R-CPUs 52 and
53. Specifically, in the step S10, a menu panel for
initialization of the game is displayed. In the step S11, the
M-CPU 51 judges whether or not a decision button (for
example, a start button) of the controller 30 is
depressed. When it is judged that the start button is
depressed, the program proceeds to the step S12.
When it is judged that the start button is not depressed,
the program is returned to the step S10. In the step S12,
the M-CPU 51, the R-CPU 52 and the R-CPU 53
perform image display processing for the progress of the
game on the basis of program data and image data
which are stored in the external memory 20. In the step
S13, an object 1 which is, for example, an enemy
character generating sounds and an object 2 which is a hero
character operated by an operator are displayed on the
display 41 as one scene on a game screen. In the step
S14, it is judged whether or not the condition that the
object 1 should generate sounds is satisfied on the
basis of a game program. When it is judged that the
condition that the object 1 should generate sounds is
satisfied, the program proceeds to the step S15. In the
step S15, processing for audio output (processing in a
subroutine described in detail later while referring to
Fig. 12) is performed. On the other hand, in
the step S14, when it is judged that the condition that
the object 1 should generate sounds is not satisfied, the
program is returned to the step S12. In the step S12,
image display processing corresponding to the
progress of the game is continued.

Referring now to Fig. 12, description is made of
operations in the subroutine of the audio output
processing. First, in the step S20, the M-CPU 51 reads
out the coordinate data of the object 1 and the
coordinate data of the object 2 from the coordinate
data storage area 15c in the sound memory area
15 shown in Fig. 8. In the step S21, the direction to the
object 1 as viewed from the object 2 and the distance
therebetween are then calculated on the basis of the
coordinate data of the object 1 and the coordinate data
of the object 2. In the step S22, as described while
referring to Figs. 3 to 6, the amount of delay is calculated on
the basis of the direction in which the object 1 exists,
and the sound volume is calculated on the basis of the
direction and the distance. In the step S23, data
representing the sound volume and the amount of delay
which are found by the calculation and data
representing the type of sound are then transferred to the R-CPU
53. In the step S24, the R-CPU 53 then reads out an
audio frame (audio data for each processing unit) from
the sound source data memory 22 in the external
memory 20 on the basis of the data representing the type of
sound. In the step S25, the audio frame read out from
the sound source data memory 22 is then written into
the non-delay buffer area 15a shown in Fig. 9. In the
step S26, the sound volume of the audio frame is then
controlled on the basis of the data representing the
sound volume. Specifically, the sound volume is
separately controlled on the left and right sides in
correspondence to the direction of the object 1 as indicated
by L and R in Fig. 3 or 4, and the sound volume is
controlled in correspondence to the distance to the object 1
as shown in Fig. 5. In the step S27, data of the audio
frame whose sound volume is controlled is then read
out from the final address of the non-delay buffer area
15a. In the step S28, the audio frame read out is then
outputted as audio data on a first channel. In the step
S29, the R-CPU 53 then judges whether or not there is
a delay on the basis of the data representing the
amount of delay. When it is judged that there is no delay,
the program proceeds to the step S30. In the step S30,
the R-CPU 53 outputs the audio frame read out from the
non-delay buffer area 15a as audio data on a second
channel.

On the other hand, when it is judged that there is a
delay, the program proceeds to the step S31. In the step
S31, the R-CPU 53 writes the audio frame read out from
the non-delay buffer area 15a into the delay buffer area
15b. In the step S32, the R-CPU 53 then judges
whether or not the previous amount of delay is the same
as the current amount of delay. When it is judged that
they are not the same, the program proceeds to the step
S33. In the step S33, the R-CPU 53 performs
re-sampling processing of the audio frame. Specifically, when
the current amount of delay is smaller than the previous
amount of delay, the R-CPU 53 compresses the audio
frame by the amount of change in the amount of delay.
When the current amount of delay is larger than the
previous amount of delay, the R-CPU 53 expands the audio
frame by the amount of change in the amount of delay.
After the re-sampling processing of the audio frame in
the step S33, the program proceeds to the step S34.

On the other hand, when it is judged in the step S32
that the current amount of delay is the same as the
previous amount of delay, the program proceeds to the step
S34. Further, when the current amount of delay
corresponds to the amount of delay at the time of starting
sound generation, the current amount of delay cannot
be compared with the previous amount of delay,
whereby the program proceeds to the step S34
considering that they are the same. In the step S34, the
R-CPU 53 designates a value of an address in the
address register 15d on the basis of the amount of
delay. In the step S35, the R-CPU 53 then reads out the
audio frame written into the delay buffer area 15b from
the designated address, and outputs the audio frame as
audio data on the second channel.

Referring now to a timing chart shown in Fig. 13, 14 
or 15, description is made of specific operations in the 
foregoing steps S29 to S35. 

In the foregoing step S29, when it is judged that
there is no amount of delay, audio data are outputted as
shown in the timing chart of Fig. 13. In this example, the
length of time of audio data for each processing unit (an
audio frame) is set to 1/240 seconds (approximately 4
ms), for example. When the audio frame 1 is outputted,
delay processing need not be performed by the first
channel and the second channel, whereby the R-CPU
53 simultaneously outputs the audio frame 1 to the first
channel and the second channel. This is also repeated
with respect to data corresponding to the audio frame 2
and the subsequent data.

When it is judged in the step S29 that there is an
amount of delay, and it is judged in the subsequent step
S32 that the previous amount of delay and the current
amount of delay are the same (that is, the amount of
delay is constant), the audio data are outputted as in the
timing chart of Fig. 14. In this example, it is assumed
that a time period corresponding to the amount of delay
is variably controlled in the range of 0 to 1/1000
seconds (1 ms), and the second channel has a constant
amount of delay of 1/2000 seconds (0.5 ms). When the
sound generation is started, the R-CPU 53 outputs the
audio frame 1 on the side of the second channel after a
delay of 0.5 ms from the audio frame 1 on the side of the
first channel in order to form a portion where there is no
sound for only a time period corresponding to the
amount of delay on the side of the second channel.
When the audio frame 1 on the side of the first channel
has been outputted, a portion, which corresponds to 0.5
ms, of the audio frame 1 remains on the side of the
second channel. When the audio frame 2 is outputted on
the side of the first channel, the portion, which
corresponds to the remaining 0.5 ms, of the audio frame 1 is
outputted on the side of the second channel, after which
the audio frame 2 is outputted. This is repeated with
respect to data corresponding to the audio frame 3 and
the subsequent data. Consequently, the audio data are
always outputted on the side of the second channel
after a delay of 0.5 ms from those on the side of the first
channel. This operation is repeated until the amount of
delay is changed or the sound generation is terminated,
whereby it is possible to achieve audio output
processing in a case where the amount of delay is constant.

When it is judged in the step S29 that there is an
amount of delay, and it is judged in the subsequent step
S32 that the previous amount of delay and the current
amount of delay are not the same (that is, the amount of
delay is variable), audio data are outputted as shown in
the timing chart of Fig. 15. For example, consider a case
where, when the audio frame 1 and the audio frame 2 on
the side of the first channel are outputted, the audio
frame 1 and the audio frame 2 on the side of the second
channel are respectively outputted after delays of 0.5
ms and 0.25 ms (that is, the amount of delay is changed
from 0.5 ms to 0.25 ms). The amount of delay at this
time is decreased by 0.25 ms.
Therefore, the R-CPU 53 compresses the audio frame 1
on the side of the second channel by 0.25 ms
corresponding to the amount of change in the amount of
delay (that is, the audio frame 1 is re-sampled from 4 ms
to 3.75 ms). When the audio frame 2 on the side of the
first channel is outputted, a portion, which corresponds
to 0.25 ms, of the audio frame 1 remains on the side of
the second channel, whereby it is possible to achieve
the change in the amount of delay from 0.5 ms to 0.25
ms. When the audio frame 3 on the side of the first
channel is then outputted, the amount of delay on the
side of the second channel is changed to 0.75 ms,
whereby the R-CPU 53 expands the audio frame 2 on
the side of the second channel by 0.5 ms corresponding
to the amount of change in the amount of delay (that is,
the audio frame 2 is re-sampled from 4 ms to 4.5 ms).
When the audio frame 3 on the side of the first channel
is outputted, a portion, which corresponds to 0.75 ms, of
the audio frame 2 remains on the side of the second
channel, whereby it is possible to achieve the change in
the amount of delay from 0.25 ms to 0.75 ms.
Specifically, letting n be the length of time of the audio frame,
db be the previous amount of delay, and df be the next
amount of delay, the audio output processing in a case
where the amount of delay is variable can be achieved
by re-sampling processing of the audio frame from (n) to
(n + df - db).
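
The re-sampling rule from (n) to (n + df - db) can be checked with a short sketch. The function names are hypothetical, and linear interpolation is an assumption chosen only for illustration, since the text does not state which re-sampling method the R-CPU 53 uses.

```python
def resampled_length(n, db, df):
    """New frame length when the delay changes from db to df (same units as n)."""
    return n + df - db

def resample_linear(frame, new_len):
    """Stretch or shrink a frame to new_len samples by linear interpolation.

    This is an illustrative method only; the source does not specify one.
    """
    if not frame or new_len <= 0:
        raise ValueError("frame must be non-empty and new_len positive")
    if len(frame) == 1 or new_len == 1:
        return [frame[0]] * new_len
    scale = (len(frame) - 1) / (new_len - 1)
    out = []
    for i in range(new_len):
        pos = i * scale
        lo = int(pos)
        hi = min(lo + 1, len(frame) - 1)
        frac = pos - lo
        out.append(frame[lo] * (1 - frac) + frame[hi] * frac)
    return out
```

With n = 4 ms, a change of delay from 0.5 ms to 0.25 ms gives 3.75 ms, and a change from 0.25 ms to 0.75 ms gives 4.5 ms, in agreement with the two cases described for Fig. 15.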

Consequently, when the amount of delay is varia- 
ble, the production of noise due to the overlapping and 
the drop of data can be prevented by re-sampling 
processing of the audio data. 

Although the present invention has been described 
and illustrated in detail, it is clearly understood that the 
same is by way of illustration and example only and is 
not to be taken by way of limitation, the spirit and scope 
of the present invention being limited only by the terms 
of the appended claims. 

Claims 

1. A sound generator for generating, in an image display
device for displaying a three-dimensional
image as viewed from a virtual camera, sounds
having a spatial extent corresponding to the
three-dimensional image, comprising:

first sound generation means for generating a
first sound;

second sound generation means for generating
a second sound;

sound source data storage means for digitally
storing sound source data;

temporary storage means for temporarily storing
the sound source data read out from said
sound source data storage means;

delay time calculation means for calculating,
when said image display device displays a first
display object so defined as to generate
sounds, delay time on the basis of a direction to
the first display object as viewed from the
position of a predetermined viewpoint;

audio processing means for reading out the
sound source data corresponding to said first
display object from said sound source data
storage means for each unit time, storing the
sound source data in said temporary storage
means and reading out the sound source data
as a first sound source data, and reading out
the sound source data stored in the temporary
storage means as a second sound source data
at timing delayed by the delay time calculated
by said delay time calculation means from
timing at which said first sound source data is read
out;

first digital-to-analog conversion means for
converting the first sound source data read out
from said temporary storage means into an
analog audio signal and feeding the analog
audio signal to said first sound generation
means; and

second digital-to-analog conversion means for
converting the second sound source data read
out from said temporary storage means into an
analog audio signal and feeding the analog
audio signal to said second sound generation
means.

2. The sound generator according to claim 1, wherein

said delay time calculation means calculates
the delay time on the basis of a direction to said first
display object as viewed from said virtual camera.

3. The sound generator according to claim 2, wherein

said audio processing means comprises
sound volume control means for individually
controlling the sound volumes of said first and second
sounds on the basis of the distance between said
first display object and said virtual camera.

4. The sound generator according to claim 3, wherein

said sound volume control means controls
the respective sound volumes of said first and
second sounds so as to be inversely proportional to a
predetermined coefficient times the distance
between said first display object and said virtual
camera.

5. The sound generator according to claim 3, wherein

said sound volume control means suppresses
the control ranges of the sound volumes in
inverse proportion to a predetermined coefficient
times the distance between said first display object
and said virtual camera.

6. The sound generator according to claim 1, wherein

there is provided a controller in relation to said
image display device, and

said delay time calculation means calculates
the delay time on the basis of a direction to the first
display object as viewed from the second display
object when said image display device displays
such a second display object that its display
position is changed in response to an operation of said
controller by a player, and the line of sight of said
virtual camera is moved in synchronization with its
movement in addition to said first display object.

7. The sound generator according to claim 6, wherein 

said audio processing means comprises 
sound volume control means for individually con- 
trolling the sound volumes of said first and second 
sounds on the basis of the distance between said 
first display object and said second display object. 

8. The sound generator according to claim 7, wherein 

said sound volume control means controls 
the respective sound volumes of said first and sec- 
ond sounds so as to be inversely proportional to a 
predetermined coefficient times the distance 
between said first display object and said second 
display object. 

9. The sound generator according to claim 7, wherein 

said sound volume control means sup- 
presses the control ranges of the sound volumes in 
inverse proportion to a predetermined coefficient 
times the distance between said first display object 
and said second display object. 

10. The sound generator according to claim 1, wherein

said temporary storage means comprises

first temporary storage means capable of storing
the sound source data read out for each
unit time from said sound source data storage
means, and having a capacity for storing the
amount of the sound source data which
corresponds to at least the unit time, and

second temporary storage means capable of
storing the sound source data read out for each
unit time from said sound source data storage
means, and having a capacity for storing the
amount of the sound source data which is
larger by an amount corresponding to
predetermined maximum delay time than that stored in
the first temporary storage means, and

said audio processing means comprises

writing the sound source data read out from
said sound source data storage means for
each unit time into said first temporary storage
means, and then reading out the written sound
source data as a first sound source data, and

writing the sound source data read out from
said sound source data storage means for
each unit time into the second temporary
storage means with a write address changed
depending on the delay time calculated by said
delay time calculation means, and then reading
out the written sound source data as a second
sound source data, to delay the second sound
source data from the first sound source data by
desired time in the range of said maximum
delay time.

11. The sound generator according to claim 1, wherein

said temporary storage means comprises

first temporary storage means capable of storing
the sound source data read out for each
unit time from said sound source data storage
means, and having a capacity for storing the
amount of the sound source data which
corresponds to at least the unit time, and

second temporary storage means capable of
storing the sound source data read out from
said first temporary storage means, and having
a capacity for storing the amount of the sound
source data which corresponds to
predetermined maximum delay time, and

said audio processing means comprises

writing the sound source data read out from
said sound source data storage means for
each unit time into said first temporary storage
means, and then reading out the written sound
source data as a first sound source data, and

writing the first sound source data read out
from said first temporary storage means into
said second temporary storage means, and
then reading out the first sound source data as
a second sound source data with a read
address changed depending on the delay time
calculated by said delay time calculation
means, to delay the second sound source data
from the first sound source data by desired
time in the range of said maximum delay time.